ANOVA and Regression: equivalent but not always equally useful


Advantages of regression (see also Keith, Multiple Regression and Beyond 1E, p. 17)


Regression easily accommodates both categorical and continuous IVs, whereas using continuous IVs in ANOVA (i.e., Analysis of Covariance, "ANCOVA") is more complicated and doesn't readily allow for potential interactions between continuous and categorical IVs.
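
A minimal sketch in Python (statsmodels) of what this looks like in practice: one categorical IV and one continuous IV predicting a DV, with their interaction included. The column names ("group", "anxiety", "score") and the simulated data are hypothetical, just for illustration.

```python
# Sketch: regression with a categorical IV, a continuous IV, and their
# interaction -- the case that standard ANCOVA handles awkwardly.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "group": rng.choice(["control", "treatment"], size=n),
    "anxiety": rng.normal(50, 10, size=n),
})
df["score"] = (
    5
    + 2.0 * (df["group"] == "treatment")
    + 0.3 * df["anxiety"]
    + 0.1 * (df["group"] == "treatment") * df["anxiety"]   # interaction
    + rng.normal(0, 3, size=n)
)

# C() marks the categorical IV; '*' expands to both main effects plus the interaction.
model = smf.ols("score ~ C(group) * anxiety", data=df).fit()
print(model.summary())
```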

In regression the effects of multiple IVs are interpretable even with four or more at once, whereas interpreting a four-way (or more) factorial ANOVA becomes unwieldy due to the many interactions (which wouldn't usually be included in the regression analysis).

Regression is equally appropriate for both experimental (manipulated) and non-experimental variables. This is true of ANOVA too and thus shouldn't count in favor of regression, though there is a tendency to casually assign cause-and-effect interpretations to ANOVA more so than to regression.

Regression naturally emphasizes effect size by including estimates like R2, b, and β rather than just p-values, and the linear model is made explicit. ANOVA has its own effect size measures, but the proportion-of-variance-explained type and the standardized-difference-between-means type are a little less intuitive and therefore more open to misinterpretation.
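
As a rough illustration (hypothetical variable names), all three kinds of effect-size information can be read off one fitted regression model:

```python
# Sketch: pulling R2, unstandardized b, and standardized beta from one OLS fit.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"hours": rng.normal(10, 3, n), "ability": rng.normal(100, 15, n)})
df["gpa"] = 1.0 + 0.08 * df["hours"] + 0.01 * df["ability"] + rng.normal(0, 0.4, n)

fit = smf.ols("gpa ~ hours + ability", data=df).fit()
print("R-squared:", fit.rsquared)       # proportion of variance explained
print("b (raw units):\n", fit.params)   # unstandardized slopes

# Standardized betas: refit the same model on z-scored variables.
z = (df - df.mean()) / df.std(ddof=0)
fit_z = smf.ols("gpa ~ hours + ability", data=z).fit()
print("beta (standardized):\n", fit_z.params)
```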


Advantages of ANOVA


ANOVA is probably more familiar to most people -- historically, but even today.

In ANOVA there's a simpler treatment of interaction effects (which, after all, may be a main focus of the research) and repeated-measures effects, whereas regression would require multiple additional columns of variables for product vectors and dummy-coded subject vectors, as sketched below.
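
A sketch of the extra columns the regression approach would need for a small repeated-measures design: dummy-coded subject vectors plus a product vector for an interaction. The data frame and column names are hypothetical.

```python
# Sketch: building the design columns regression needs for a tiny
# repeated-measures layout (3 subjects x 2 conditions).
import pandas as pd

df = pd.DataFrame({
    "subject":   ["s1", "s1", "s2", "s2", "s3", "s3"],
    "condition": [0, 1, 0, 1, 0, 1],          # within-subject factor, coded 0/1
    "covariate": [3.2, 3.5, 2.8, 2.9, 4.1, 4.0],
})

# One dummy column per subject beyond the first (drop_first avoids redundancy).
subject_dummies = pd.get_dummies(df["subject"], prefix="subj", drop_first=True, dtype=int)

# Product vector for a condition x covariate interaction.
df["cond_x_cov"] = df["condition"] * df["covariate"]

design = pd.concat([df, subject_dummies], axis=1)
print(design)
```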

ANOVA easily accommodates different error terms for different effects in complex factorial designs, whereas regression uses the model residual (MSresidual) as the error term for every effect or incremental R2 tested.


Interactions in ANOVA and regression


In ANOVA, interactions are all included and analyzed by default, which is appropriate for experimental designs in which interactions are expected, planned for, and produced by controlling experimental conditions; even then, though, most interactions turn out nonsignificant, especially higher-order interactions that aren't of theoretical interest.

In regression, interactions are only analyzed when they're deliberately included in an analysis; it's appropriate to omit them in nonexperimental field research measuring preexisting characteristics, where interactions mostly don't occur or typically have little theoretical interest when they do. When an interaction is of interest, it's handled in regression as a moderation analysis, sketched below.
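
A minimal moderation-analysis sketch (hypothetical variable names), with the interaction term added deliberately and the predictors centered so the lower-order terms stay interpretable:

```python
# Sketch: moderation analysis as a deliberately added interaction term.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300
df = pd.DataFrame({"stress": rng.normal(0, 1, n), "support": rng.normal(0, 1, n)})
df["wellbeing"] = (10 - 1.0 * df["stress"] + 0.5 * df["support"]
                   + 0.4 * df["stress"] * df["support"] + rng.normal(0, 1, n))

# Center predictors so the main-effect slopes are interpretable at the mean.
df["stress_c"] = df["stress"] - df["stress"].mean()
df["support_c"] = df["support"] - df["support"].mean()

fit = smf.ols("wellbeing ~ stress_c * support_c", data=df).fit()
print(fit.params)   # the stress_c:support_c coefficient is the moderation effect
```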


Reasons not to dichotomize a continuous variable to make it a categorical variable

(see also Pedhazur, Multiple Regression in Behavioral Research 3E, pp. 574-577)


Treating continuous variables as categorical is often done only because the researcher is familiar with analyzing group differences and less familiar with techniques for analyzing continuous variables.

It results in loss of information and accuracy, since precise numerical measurements are collapsed into coarse batches of roughly similar measurements; e.g., the IQ scale is reduced to "low and high" or maybe "low, medium, and high".
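
A quick simulation (hypothetical IQ-like scores) showing how much of an IV-DV relationship a median split throws away:

```python
# Sketch: a median split on a continuous IV attenuates its observed
# correlation with the DV.
import numpy as np

rng = np.random.default_rng(3)
n = 1000
iq = rng.normal(100, 15, n)
dv = 0.5 * (iq - 100) / 15 + rng.normal(0, 1, n)   # true correlation around .45

iq_dichotomized = (iq > np.median(iq)).astype(float)  # "low" vs "high"

print("r with continuous IQ:   ", np.corrcoef(iq, dv)[0, 1])
print("r with median-split IQ: ", np.corrcoef(iq_dichotomized, dv)[0, 1])
```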

The "low-scorer" category actually ranges from low to average, and the "high-scorers" range from average to high, with nearly identical subjects in the middle of the range being labeled arbitrarily differently -- but all subjects within each group have the same category label and are considered to be different from all subjects in the other group

Multiple splits to identify the highest and lowest groups of scores result in loss of subjects from the excluded middle category of scores.

Focusing on, say, just the top and bottom 10% of a sample to identify the very high and very low scorers may wrongly imply linearity in the whole data set -- e.g., if it implicitly assumes that an intermediate group would score between the high and low groups, without considering that the low group could be an unrepresentative drop-off in the tail of the distribution; it also throws away 80% of the subjects.

Reducing the number of subjects of interest exacerbates subject loss in a two-phase evaluation: if, for instance, only 60% of subjects show up for the second phase, group sizes that were initially equal will probably be unequal by then.

Subjects in the dichotomized high and low score groups are not randomly assigned to those categories, so any conclusions about group differences on other variables are confounded with unknown factors and can't be attributed to the grouping IV.

Grouping results in loss of degrees of freedom from the error term: A continuous variable takes up 1 degree of freedom for the numerator of the F ratio, while a group variable takes up [number of groups - 1] degrees of freedom, leaving fewer df for the denominator and reducing the likelihood of attaining statistical significance.

Dividing scores into three groups is better than just two, and four is better than three, but with each additional group, more df come out of the error term. Four groups take up 3 df, compared to the 1 df for the continuous variable, so there are 2 fewer error df; 6 groups take up 5 df, so there are 4 fewer error df. Fewer error df make it harder to attain statistical significance, as the worked example below shows.
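
A small worked example of the df arithmetic, assuming a hypothetical sample of N = 100 subjects and a single IV:

```python
# Sketch: error (denominator) df left over after a continuous IV vs. k groups.
N = 100

def error_df(numerator_df, n=N):
    # error df = N - numerator df - 1 (the 1 is for the intercept / grand mean)
    return n - numerator_df - 1

print("continuous IV: 1 numerator df ->", error_df(1), "error df")
for k in (2, 3, 4, 6):
    print(f"{k} groups:    {k - 1} numerator df ->", error_df(k - 1), "error df")
```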

Making groups out of a continuous IV is better only when the IV-DV relationship is nonlinear, since the F statistic for the group IV includes variance from both linear and nonlinear effects. Regression measures only linear IV-DV relationships, so if most of an IV's relationship to the DV is nonlinear, ANOVA captures what regression would not. In regression, however, a nonlinear IV-DV relationship may be captured by adding a quadratic term (or even a cubic term) to the equation, especially if there are theoretical grounds for predicting that relationship.
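
A minimal sketch (hypothetical variable names) of adding a quadratic term to pick up a nonlinear IV-DV relationship in regression:

```python
# Sketch: comparing a linear-only fit to one with a quadratic term when the
# true IV-DV relationship is curved.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({"dose": rng.uniform(0, 10, n)})
df["response"] = 2 + 1.5 * df["dose"] - 0.12 * df["dose"] ** 2 + rng.normal(0, 1, n)

linear    = smf.ols("response ~ dose", data=df).fit()
quadratic = smf.ols("response ~ dose + I(dose ** 2)", data=df).fit()

print("R-squared, linear only:    ", linear.rsquared)
print("R-squared, with quadratic: ", quadratic.rsquared)
```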